Tyrosine kinome

ABSTRACT

Protein kinases are important signaling molecules involved in tumorigenesis. Mutational analysis of the human tyrosine kinase gene family (98 genes) identified somatic alterations in approximately 20% of colorectal cancers, with the majority of mutations occurring in NTRK3, FES, GUCY2F and a previously uncharacterized tyrosine kinase gene called MCCK/MLK4. Most alterations were in conserved residues affecting key regions of the kinase domain. These data represent a paradigm for the unbiased analysis of signal transducing genes in cancer and provide useful targets for therapeutic intervention.

This invention was made using funds from the United States government under grants NIH Award CA 43460 and CA 62924. The U.S. government therefore retains certain rights in the invention according to the terms of the grants.

A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.

FIELD OF THE INVENTION

The invention relates to the field of cancer genetics and therapeutics. In particular, it relates to genetic changes that affect protein kinase gene families or other gene families. These genetic changes are useful in diagnostic, prognostic, drug discovery, and clinical drug testing applications.

BACKGROUND OF THE INVENTION

Tyrosine kinases (TKs) are central regulators of signaling pathways that control differentiation, transcription, cell cycle progression, apoptosis, motility, and invasion (1). Although genetic alterations in a few TK genes have been linked to human cancer (2), most TK genes have not been directly implicated in tumorigenesis. Additionally, it is not known how many or how often members of the TK gene family are altered in any particular cancer type.

BRIEF SUMMARY OF THE INVENTION

In a first embodiment of the invention a method is provided for detecting mutations involved in cancer. Members of a family of genes in a database of human nucleotide sequences are identified based on homology to a known member of the family. Nucleotide sequence differences in a selected region of each of the members of the family of genes are identified in matched pairs of an individual's cancer cells and normal cells. Such differences identify members of heightened interest. Additional nucleotide sequence differences in the members of heightened interest are determined, either in one or more additional regions outside of the selected region, or in matched pairs of cancer cells and normal cells of additional individuals, or in both.

Another embodiment of the invention provides a method of screening test substances for use as anti-cancer agents. A test substance is contacted with an activated protein kinase selected from the group consisting of: NTRK3, FES, MCCK/MLK4, EPHA3, NTRK2, INSRR, JAK1, PDGFRA, EPHA7, EPHA8, KDR, FGFR1, and ERBB4. Activity of the activated protein kinase is assayed. A test substance which inhibits the activity of the activated protein kinase is a potential anti-cancer agent.

Another embodiment of the invention provides a method of screening test substances for use as anti-cancer agents. A test substance is contacted with a mutated GUCY2F guanylate cyclase. Activity of the mutated GUCY2F guanylate cyclase is assayed. A test substance which increases the activity of the mutated GUCY2F guanylate cyclase is a potential anti-cancer agent.

Another embodiment of the invention provides an isolated, activated protein kinase. The kinase is selected from the group consisting of: NTRK3, FES, MCCK/MLK4, GUCY2F, EPHA3, NTRK2, INSRR, JAK1, PDGFRA, EPHA7, EPHA8, KDR, FGFR1, and ERBB4.

Another embodiment of the invention provides an isolated, mutated GUCY2F protein.

Still another embodiment of the invention is a method of categorizing cancers. The sequence of one or more protein kinase family members in a sample of a cancer tissue is determined. The one or more members is selected from the group consisting of NTRK3, FES, MCCK/MLK4, EPHA3, NTRK2, INSRR, JAK1, PDGFRA, EPHA7, EPHA8, GUCY2F, KDR, FGFR1, and ERBB4. A somatic mutation of said one or more protein kinase family members is identified in the cancer tissue. The cancer tissue is assigned to a group based on the presence of the somatic mutation.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows detection of mutations in tyrosine kinase genes. Representative examples of mutations in NTRK3 (FIG. 1A) and MCCK/MLK4 (FIG. 1B) identified using the Mutation Explorer software package (SoftGenetics, State College, PA). In each case, the top box contains the sequence chromatogram from tumor DNA, the middle box contains the sequence chromatogram from normal tissue from the same patient, and the lower box contains a computed comparison between the tumor and normal traces displaying a peak at the observed alteration.

FIG. 2 shows distribution of mutations in NTRK3, FES, MCCK/MLK4 and GUCY2F. Arrows indicate location of mutations while boxes represent functional domains.

FIG. 3 shows sequence conservation and location of mutations in altered genes. Alignment of amino acid sequences for (FIG. 3A) NTRK3, (FIG. 3B) FES, NTRK2 and EPHA3, and (FIG. 3C) MCCK/MLK4. Conserved residues are indicated by a dot, while nonconserved residues are indicated by a letter. The positions of identified mutations in each gene are highlighted in yellow, while positions of mutations in MET and BRAF are highlighted in blue. Underlined regions represent the activation loop (subdomain VII and VIII).

DETAILED DESCRIPTION OF THE INVENTION

Any database, whether public, subscription, or proprietary, can be used in the present invention to identify members of a family of genes. Databases can be of nucleotide sequences or protein sequences. They can be of genomic sequences or expressed sequence tags or cDNA sequences. Preferably the sequences are human sequences, although the same methods can be used for other species. Homology to a known member may be based on limited portions of the known member, such as a catalytic domain or a regulatory domain. Alternatively, homology may be based on the whole protein. Any algorithm or program known in the art can be used. Suitable programs are available publicly and commercially, or they can be made by the individual worker in the art.

Nucleotide sequence differences in a family member can be determined between a sample of cancer cells and normal cells. Cancer and normal cells typically are matched pairs, i.e., they are derived from the same individual and optionally from the same organ. Any technique can be used to determine nucleotide sequence differences. Sequencing of genomic DNA or cDNA can be used. Other techniques which detect differences between two sequences can also be used, without limitation. Techniques which detect differences between the encoded proteins can also be used, since a change in the amino acid of a protein indicates that the nucleotide sequence has been changed.
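The matched-pair comparison described above can be sketched as a set difference. This is a minimal illustration only: the `Variant` record, the function name, and the gene labels `GENE_A` and `GENE_B` are hypothetical and do not come from the disclosure, and any technique that yields comparable variant calls could stand in for sequencing.

```python
from typing import NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical record for one nucleotide difference from the reference."""
    gene: str
    position: int
    ref: str
    alt: str


def somatic_variants(tumor: Set[Variant], normal: Set[Variant]) -> Set[Variant]:
    """Return variants seen in the cancer cells but absent from the matched normal cells."""
    return tumor - normal


# Illustrative data only; these calls are not taken from the study.
tumor = {Variant("GENE_A", 100, "C", "T"), Variant("GENE_B", 250, "G", "A")}
normal = {Variant("GENE_B", 250, "G", "A")}  # also present in normal cells, so not somatic

print(somatic_variants(tumor, normal))
```

A variant shared by both samples of the same individual is treated as inherited and dropped; only the tumor-specific difference remains.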

Selected regions of the family members can be initially screened for nucleotide differences. Any basis for selecting a region can be used. Regions can be selected based on knowledge of mutations in similar regions of other proteins, or based on predictions of particularly important domains of the encoded proteins. Examples of important domains include, but are not limited to, catalytic domains and regulatory domains.

One method for determining the functional significance of any mutation which is found is to determine the effect that the mutation has on the encoded protein. A synonymous mutation creates no change in the encoded protein and is sometimes termed silent. Such a mutation is less likely to be functionally relevant to cancer than a non-synonymous mutation. One can determine an encoded protein by identifying an mRNA transcribed from a gene containing a mutation and translating the mRNA (or derived cDNA).
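As a rough sketch of this classification, the following compares the amino acids encoded by a reference codon and a mutated codon using the standard genetic code. The function name and the example codons are illustrative assumptions, not taken from the disclosure; a nonsense change is reported separately here although it is a kind of non-synonymous change.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon change as synonymous, nonsense, or non-synonymous."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"      # encoded protein is unchanged ("silent")
    if alt_aa == "*":
        return "nonsense"        # new stop codon truncates the protein
    return "non-synonymous"      # encoded amino acid changes


print(classify_substitution("GAA", "GAG"))  # Glu -> Glu
print(classify_substitution("CGT", "TGT"))  # Arg -> Cys
print(classify_substitution("TGG", "TGA"))  # Trp -> stop
```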

Another method for determining functional significance of a mutation is to determine if the mutation affects an evolutionarily conserved amino acid residue. This can be done by aligning sequences of the same protein from different species and assessing which ones are invariant or predominantly so. The mutation is then compared with this determination of evolutionarily conserved residues to identify if the mutation affects such a residue.
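One way to picture this check, assuming the orthologous sequences have already been aligned to equal length with `-` as the gap character, is to list the alignment columns that are identical in every species. The function name and the toy sequences are hypothetical.

```python
from typing import List


def invariant_columns(aligned: List[str]) -> List[int]:
    """Return 1-based alignment columns where all sequences share one residue and none has a gap."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    columns = []
    for i in range(length):
        residues = {seq[i] for seq in aligned}
        if len(residues) == 1 and "-" not in residues:
            columns.append(i + 1)
    return columns


# Toy alignment of the same protein from three species (illustrative only).
orthologs = ["MKRLG", "MKQLG", "MKRLG"]
print(invariant_columns(orthologs))  # column 3 varies, the others do not
```

A mutation falling in one of the returned columns affects an invariant residue; a looser "predominantly conserved" criterion would need a frequency threshold instead of strict identity.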

Another method for attributing functional significance to a mutation is to determine if it affects an important domain of the protein. Such domains include but are not limited to catalytic domains and regulatory domains. Another index of functional significance of a new mutation can be found by comparing the amino acid residue affected by the new mutation with equivalent residues in other proteins. If mutations have been found affecting the equivalent residues and those mutations have been determined to be associated with disease, then the new mutation is more likely to be functionally significant. The equivalent mutation may be in the positionally equivalent amino acid residue, or in a close neighbor, perhaps within 5, within 3, within 2, or within 1 residue of the positionally equivalent amino acid residue.
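The positional-equivalence test can be sketched as follows, assuming a pairwise alignment of the two proteins is available as two gapped strings of equal length. Both function names and the toy alignment are hypothetical; the window values mirror the "within 5, 3, 2, or 1 residue" language above.

```python
from typing import Iterable, Optional


def equivalent_residue(aln_a: str, aln_b: str, pos_a: int) -> Optional[int]:
    """Map 1-based residue pos_a of protein A to the aligned residue number in protein B.

    Returns None if the residue of A is aligned to a gap in B or lies beyond the alignment.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    count_a = count_b = 0
    for ch_a, ch_b in zip(aln_a, aln_b):
        if ch_a != "-":
            count_a += 1
        if ch_b != "-":
            count_b += 1
        if ch_a != "-" and count_a == pos_a:
            return count_b if ch_b != "-" else None
    return None


def near_known_mutation(pos_b: int, known_positions: Iterable[int], window: int = 5) -> bool:
    """True if pos_b is within `window` residues of any known disease-associated position."""
    return any(abs(pos_b - known) <= window for known in known_positions)


# Toy alignment: protein B has one extra residue relative to protein A.
aln_a = "MK-RLG"
aln_b = "MKARLG"
mapped = equivalent_residue(aln_a, aln_b, 3)    # residue 3 of A aligns with residue 4 of B
print(mapped, near_known_mutation(mapped, [7], window=3))
```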

The identified protein kinase family members which have been found to harbor cancer-associated mutations can be used to screen test substances for use as anti-cancer agents. The encoded mutant proteins can be isolated from cells and used in vitro in a cell-free assay. Alternatively, cancer cell lines harboring the mutant protein kinase family members can be used. Cells which have been genetically modified to express the encoded mutant protein can also be used. Regardless of the form in which the mutant protein is presented, it can be contacted with a test substance and the effect on enzymatic activity assessed. If the mutant protein is an activated protein kinase, then test substances will desirably inhibit the activity. If the mutant protein is enzymatically less active than its wild-type cognate, then the test substance will desirably restore activity. Although the family members were selected as being homologous to a tyrosine kinase, not all family members are tyrosine kinases. Some phosphorylate other residues of proteins, such as serine and/or threonine. Others contain inactive kinase domains and have other catalytic domains, such as guanylate cyclase activity. Assays for tyrosine, serine, or threonine kinase activity are well known in the art. See, e.g., the HitHunter™ Enzyme Fragment Complementation Assay of Applied Biosystems, Foster City, Calif., and the Tyrosine Kinase Assay Kits (Green or Red) of Panvera, Madison, Wis. Any such assay can be used. Assays for guanylate cyclase are also well known. One commercially available assay kit which may be used is a cGMP RIA assay (Amersham, Bucks., UK).

An isolated protein, whether an activated protein kinase or a mutant guanylate cyclase, can be obtained from cancer cells which express such proteins. Alternatively, it can be obtained from cells which have been genetically modified to express such cancer-specific forms of the protein. Any means for isolating the enzymes from the cells can be used to form a cell-free preparation. Further purification of the enzymes can be performed as desired. Any purification methods known in the art can be used without limitation, including immunoaffinity methods and chromatography methods.

Mutations in the kinase family members taught herein can be used to categorize cancers. Such mutations can be identified in cancer tissue and not in corresponding non-cancer tissue of an individual. This pattern indicates that the mutations are somatic mutations. The cancers can be categorized based on the kinase family member which is mutated, based on the particular mutation in the kinase family member, or based on the residue mutated within the family member. Such categorization can be correlated with mortality data to enable prognosis on the basis of the category. Such categorization can be correlated with recurrence data to enable prognosis on the basis of the category. Such categorization can be correlated with efficacy of a therapeutic agent to enable prescription of drugs for individuals with higher probability of successful treatment. Patients can be assigned to clinical trials on the basis of the categorization of their cancers. Correlations of the categories of cancers are not limited by this list.
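A simple way to picture categorization by mutated family member is to group tumors under each gene in which a somatic mutation was found. The tumor identifiers and the pairing of tumors with mutations below are invented for illustration; only the gene and mutation names appear in the disclosure.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def categorize_by_gene(calls: Iterable[Tuple[str, str, str]]) -> Dict[str, Set[str]]:
    """Group tumor identifiers by the kinase family member carrying a somatic mutation.

    Each call is (tumor_id, gene, amino_acid_change).
    """
    groups: Dict[str, Set[str]] = defaultdict(set)
    for tumor_id, gene, _change in calls:
        groups[gene].add(tumor_id)
    return dict(groups)


# Hypothetical tumor identifiers; the assignments are illustrative only.
calls = [
    ("tumor_01", "NTRK3", "I695V"),
    ("tumor_02", "FES", "R706Q"),
    ("tumor_03", "NTRK3", "G608S"),
]
print(categorize_by_gene(calls))
```

The same grouping could be keyed on the specific mutation or the mutated residue instead of the gene, and each group could then be joined to mortality, recurrence, or drug-response data.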

EXAMPLES

Example 1 Identification of Genes Encoding a Protein Family

Using a combination of hidden Markov models and global homology searches similar to those recently described (3), we identified 98 genes encoding proteins that contained tyrosine kinase domains in the Celera (Rockville, Md.) and public genome databases (4). Seven of these represented previously uncharacterized genes that were identified solely on the basis of sequence similarity to other human tyrosine kinase genes.

Example 2 Initial Screen for Mutations in Catalytic Domain

As an initial screen to evaluate whether these genes were genetically altered in colorectal cancer, we analyzed all exons encoding the predicted kinase domain. This region has been found to harbor the great majority of previously observed tyrosine kinase gene alterations in other cancers (2, 5). A total of 589 exons containing this domain were extracted from genomic databases (6). To identify coding changes in these genes, the identified exons were amplified using polymerase chain reaction on template DNA derived from 35 colorectal cancers and directly sequenced (7). Six of the selected cancer cell lines had deficiencies in mismatch repair (MMR). Inclusion of these cancers allowed identification of genes that might be preferentially implicated in different forms of sporadic colorectal cancer, as was observed with the BRAF kinase (8, 9).

A total of 249 alterations not present in the normal human genome sequence were identified in the cancers. Of these 249, 15 alterations proved to be somatic, while the others were found to be present in normal cells of the same patients. The 15 alterations affected 13 different genes (examples in FIG. 1). One MMR-deficient and one MMR-proficient tumor were observed to have mutations in the TGF-β receptor Type II (TGFBR2) gene. These comprised two different transitions affecting the same codon: a C to T change at nucleotide position 1582 resulting in an R528C substitution, and a G to A change at position 1583 resulting in an R528H substitution. As mutations in the kinase domain of TGFBR2 are known to be quite rare (10, 11), these data indicated that our methods were sufficiently sensitive to detect mutations even when present at low frequencies.

Example 3 Expanded Screen for Mutations in Other Cancers and/or Domains

The 12 remaining mutant genes were further analyzed for mutations in another 155 colorectal cancers. Two or more additional mutations were found in only four of the genes, and all coding exons of these four genes were then analyzed in all 190 cancers (12). We thereby identified a total of 42 non-synonymous mutations (Table 1). There were six mutations in the neurotrophic receptor NTRK3, four in the feline sarcoma oncogene (FES), ten in the guanylate cyclase 2F gene (GUCY2F), and ten in a predicted tyrosine kinase-like gene with no known function, hereafter called MCCK/MLK4. Two additional genes, EPHA3 and NTRK2, had two alterations each (Table 1). Seven of ten mutations in GUCY2F, which is on the X chromosome, were homozygous, while 29 of the 34 alterations in the remaining genes were heterozygous. All of these mutations were shown to be somatic in the cancers that could be assessed; in three of the 42 cases, no normal tissue was available for comparison.

TABLE 1 Mutations observed in the tyrosine kinome

| Gene | Celera Accession | Genbank Accession | Number of Mutations* | Nucleotide | Amino acid⁺ | Residue Properties‡ |
|---|---|---|---|---|---|---|
| NTRK3 | hCT17758 | NM_002530 | 6 | A2083G | I695V | C, K, M |
| | | | | G1822A | G608S | K, M |
| | | | | C2278A | L760I | C, K |
| | | | | A2195C | K732T | K |
| | | | | G2192A | R731Q | K |
| | | | | G2192A | R731Q | K |
| FES | hCT23770 | NM_002005 | 4 | A2110G | M704V | C, K, M |
| | | | | G2117A | R706Q | C, K, M |
| | | | | G2227A | V743M | C, K |
| | | | | C2283T | S759F | C, K |
| MCCK/MLK4 | hCT6856 | NM_032435 | 10 | C781T | H261Y | C, K |
| | | | | C783G | H261Q | C, K |
| | | | | G872A | G291E | C, K, M |
| | | | | C878A | A293E | C, K, M |
| | | | | G888A | W296Stp | K |
| | | | | C1408T | R470C | C |
| | | | | C1408T | R470C | C |
| | | | | C1657T | R553Stp | C |
| | | | | A1787T | N596I | C |
| | | | | A1885G | K629E | |
| GUCY2F | hCT11696 | NM_001522 | 10 | G673T | D225Y | |
| | | | | G1078A | A360T | C |
| | | | | A1083T | Q361H | |
| | | | | C1170A | F390L | C |
| | | | | G1475A | R492H | C |
| | | | | A1635T | R545S | K |
| | | | | A1872T | E624D | C, K |
| | | | | A2333G | E778G | K |
| | | | | +2T > C | Splice site* | |
| | | | | G3226A | V1026M | C |
| TGFBR2 | hCT17988 | NM_003242 | 2 | C1582T | R528C | C, K |
| | | | | G1583A | R528H | C, K |
| EPHA3 | hCT23516 | NM_005233 | 2 | T2374C | S792P | K |
| | | | | G2416A | D806N | C, K |
| NTRK2 | hCT18879 | NM_006180 | 2 | C2084T | T695I | C, K |
| | | | | G2251A | D751N | C, K |
| INSRR | hCT31077 | XM_043563 | 1 | C2863T | T985M | C, K |
| JAK1 | hCT13272 | NM_002227 | 1 | G2656A | E886K | K |
| PDGFRA | hCT13252 | NM_006206 | 1 | +1G > A | Splice site* | K |
| EPHA7 | hCT23587 | NM_004440 | 1 | G2303T | S768I | K |
| EPHA8 | hCT31226 | NM_020526 | 1 | G2617A | D873N | C, K |
| ERBB4 | hCT6470 | NM_005235 | 1 | C3090G | I1030M | K |

*Number of mutations observed in panel of 190 colorectal cancers. For TGFBR2 only the initial panel of 36 tumors was analyzed for mutations.
⁺Amino acid change resulting from mutation. Splice site alterations affected position 2 of the donor splice site of exon 17 of GUCY2F, and position 1 of the donor site of exon 15 of PDGFRA.
‡C, residue is evolutionarily conserved; K, residue is within kinase domain; M, mutation of equivalent residue in other kinases is disease causing.
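The nucleotide and amino acid columns of Table 1 can be cross-checked with simple codon arithmetic. The sketch below assumes that nucleotide positions are numbered from the first base of the start codon, which is not stated explicitly in the table; the function name is hypothetical.

```python
from typing import Tuple


def codon_of(nucleotide_position: int) -> Tuple[int, int]:
    """Return (codon number, position within codon), both 1-based.

    Assumes coding-sequence numbering that starts at the first base of the start codon.
    """
    if nucleotide_position < 1:
        raise ValueError("nucleotide positions are 1-based")
    index = nucleotide_position - 1
    return index // 3 + 1, index % 3 + 1


# TGFBR2: nucleotides 1582 and 1583 both fall in codon 528 (R528C, R528H).
print(codon_of(1582), codon_of(1583))
# NTRK3 A2083G corresponds to codon 695 (I695V); ERBB4 C3090G to codon 1030 (I1030M).
print(codon_of(2083), codon_of(3090))
```

Under this numbering assumption the two TGFBR2 transitions land in the first and second positions of the same codon, in agreement with the text of Example 2.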

Example 4 Evidence of Functional Relevance of Mutations

One of the most difficult issues confronting the sequence analysis of cancer genomes is the distinction between functionally relevant and “passenger” mutations. Each of the clonal expansions driving the neoplastic process leads to fixation of any mutation that had previously occurred in the clone's progenitor cell, whether or not the mutation was responsible for the clonal expansion. Several observations support the hypothesis that the six genes mutated more than once among the tumors in our cohort (NTRK3, FES, MCCK/MLK4, GUCY2F, EPHA3, NTRK2) were functional rather than coincidental.

The first observation involved comparison of synonymous vs. non-synonymous alterations identified during sequencing. Synonymous mutations are likely to be passengers, as they would not be expected to exert a selective growth advantage. Only one somatic synonymous mutation was identified in these six genes, yielding a N:S (non-synonymous:synonymous) ratio of 34:1, far higher than the N:S ratio of 2:1 predicted for non-functional mutations (p<1×10⁻⁴).
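The statistical test behind the reported p-value is not named here. As one plausible reading, the sketch below uses a one-sided binomial test: under the null 2:1 ratio each mutation is non-synonymous with probability 2/3, and the question is how likely it is to see at least 34 non-synonymous changes among 35 mutations. The choice of test and the function name are assumptions.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, p: float) -> float:
    """Probability of observing at least `successes` in `trials` Bernoulli(p) draws."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# 34 non-synonymous and 1 synonymous mutation; null N:S ratio of 2:1 gives p = 2/3.
p_value = binomial_upper_tail(successes=34, trials=35, p=2 / 3)
print(f"{p_value:.2e}")
```

Under these assumptions the tail probability falls below 1×10⁻⁴, consistent with the significance level stated above.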

Second, most of the non-synonymous mutations identified in these genes occurred in conserved residues in key regions in the kinase domain (Table 1, examples in FIG. 2). All mutated residues in FES, two of six mutated residues in NTRK3, eight of ten mutated residues in MCCK/MLK4, five of ten mutated residues in GUCY2F, both mutated residues in NTRK2, and one of two mutated residues in EPHA3 were identical in all species analyzed. Based on comparisons to related tyrosine kinase genes (13, 14), these alterations were predicted to affect residues in functionally important regions of the kinase domain. In NTRK3, three alterations were located in two subdomains predicted to affect kinase activity: the I695V alteration was localized in subdomain VII, while the cluster of R732Q and K733T mutations was directly adjacent to subdomain VIII. These subdomains comprise the activation loop, normally responsible for autoinhibition of tyrosine kinase activity (13, 14). An additional substitution in FES (R706Q), two alterations in MCCK/MLK4 (G291E, A293E), and an alteration in EPHA3 (S792P) also occurred in the activation loop. Mutations in the activation loop have been shown to lead to ligand-independent tyrosine kinase activation in other genes by relief of the autoinhibitory function of these domains (2). Identical alterations at equivalent residues in NTRK2 (D751N) and EPHA3 (D806N), as well as an alteration in FES (V743M), were in subdomain IX, a region known to stabilize the catalytic loop. Finally, two substitutions in MCCK/MLK4 at position 261 were located in subdomain VIB, the catalytic loop of the kinase domain, but did not affect the invariant aspartate and asparagine residues required for phosphoryl transfer.

Many of the mutations we detected corresponded to those previously shown to be functionally mutated in other protein kinase genes (15) (Table 1, examples in FIG. 3). In NTRK3, the I695V mutation corresponded to a homologous position in the MET oncogene that is altered in renal cell carcinoma (5), while G608S represented an equivalent residue that is affected in the RET oncogene in Hirschsprung's disease (16). Two mutations in FES, M704V and R706Q, and two mutations in MCCK/MLK4 corresponded to or were just adjacent to residues that are altered in the BRAF oncogene in a variety of cancers (6), and are in a region surrounded by three previously reported mutations in MET in renal cell carcinoma (5). Additionally, one mutation in EPHA3 (S792P) was just adjacent to a previously reported alteration in MET in renal cell and hepatocellular carcinomas (5). No mutations in GUCY2F corresponded to alterations in other protein kinase genes, but two alterations (E596K and V1026M) were located near equivalent mutations of the homologous GUCY2D gene that is inactivated in Leber's congenital amaurosis (17).

It was of interest to compare the non-synonymous alterations in these six genes with the three synonymous mutations that were discovered in the study (one in the six genes noted above and two others identified during the sequencing of other tyrosine kinase genes). None of the 3 synonymous mutations occurred in residues that had previously been shown to be functionally altered in other cancers or inherited conditions. Moreover, the prevalence of these synonymous mutations, calculated to be 1.1 alterations per Mb (95% confidence interval 0.23 to 3.3 alterations per Mb) was consistent with previous estimates of the prevalence of nonfunctional alterations in tumor DNA (18). In contrast, the prevalence of non-synonymous alterations in the kinase domain of the six analyzed genes was estimated to be significantly higher at 55 alterations per Mb (95% confidence interval 33 to 85 alterations per Mb; p<0.001). We conclude that the three synonymous mutations observed were likely to be passengers while the 34 non-synonymous mutations identified among six genes were likely to be functional.
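How the confidence intervals were computed is not described. One common choice for a small mutation count is the exact Poisson interval, sketched below with bisection on the Poisson distribution function. The `megabases` argument is a placeholder, since the amount of sequence screened is not given in this passage, and all function names are hypothetical.

```python
from math import exp, factorial
from typing import Callable, Tuple


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))


def _bisect(f: Callable[[float], float], lo: float, hi: float, iterations: int = 100) -> float:
    """Find a root of f in [lo, hi]; f(lo) and f(hi) must differ in sign."""
    sign_lo = f(lo) > 0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if (f(mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def poisson_exact_ci(count: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Exact two-sided confidence interval for the mean of a Poisson count."""
    bracket = count + 10 * count**0.5 + 20  # generous upper search limit
    lower = 0.0 if count == 0 else _bisect(
        lambda lam: (1 - poisson_cdf(count - 1, lam)) - alpha / 2, 0.0, bracket)
    upper = _bisect(lambda lam: poisson_cdf(count, lam) - alpha / 2, 0.0, bracket)
    return lower, upper


def rate_per_mb(count: int, megabases: float, alpha: float = 0.05) -> Tuple[float, float, float]:
    """Mutation prevalence per Mb with its confidence bounds."""
    lower, upper = poisson_exact_ci(count, alpha)
    return count / megabases, lower / megabases, upper / megabases


# Three synonymous mutations; megabases=1.0 is a placeholder, not a value from the study.
print(rate_per_mb(3, megabases=1.0))
```

Dividing the count and both bounds by the number of megabases actually sequenced would give a prevalence estimate in the same units as those quoted above.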

Based on their positions and analogous mutations in homologous genes, the majority of alterations we observed are expected to act in a dominant fashion, leading to increased kinase activity. In this respect, the observation of nonsense alterations in MCCK/MLK4 is not unprecedented. Truncations in Src and Met resulting in constitutive kinase activity have been previously reported (19, 20). Interestingly, MCCK/MLK4 contains an SH3 domain whose homolog has been shown to autoinhibit kinase activity (21). Such kinase autoinhibition would be relieved by the nonsense codons between the kinase and SH3-binding domains at the C-terminus that we observed in two cancers.

Example 5 Significance

This study represents the first systematic mutational analysis of any gene family in a human cancer. Despite decades of research on tyrosine kinase genes, only a few of the genes we found mutated had been previously linked to tumorigenesis. A fusion gene of NTRK3 with ETV6 has been identified in congenital fibrosarcoma (22), and neurotrophin ligands, including those for NTRK2 and NTRK3, appear to stimulate the invasive behavior of at least several cancer types (23, 24). The v-fes transforming oncogene was identified as a causative agent of feline and avian sarcomas (25), but its human equivalent (FES) has not been found to be altered in any human neoplasia. A homolog of MCCK/MLK4, MLK3, as well as a homolog of EPHA3, EPHA1, have transforming abilities in NIH3T3 cells (26, 27), but their roles in tumorigenesis are otherwise unknown. GUCY2F has only been known to function in light-mediated signal transduction in photoreceptor cells of the retina (28), and had not been thought to play a role in any tissue outside the eye. Using quantitative PCR, we found that all four commonly mutated genes, including GUCY2F, were expressed in both primary cancers and cell lines derived from the colon (29).

One reason for attempting to identify tyrosine kinase mutations is that the altered proteins provide attractive targets for therapeutic intervention. This has been convincingly demonstrated with STI571 in patients with chronic myelogenous leukemia (CML) (30). The number of colorectal cancer patients with mutations in the six tyrosine kinase genes noted above exceeds the number of patients with CML or with any cancer type previously associated with tyrosine kinase mutations. These results thereby provide substantial new opportunities for drug development. Moreover, future investigation of the pathways through which these kinases act in colorectal cancer may yield new insights into pathogenesis as well as additional drug targets. Personalized therapeutics can be based on the kinases that are mutationally activated in an individual's cancer. Finally, the large-scale sequencing-based approach we used to find novel gene mutations can readily be applied to other enzyme-encoding genes in any common tumor type.

While the invention has been described with respect to specific examples including presently preferred modes of carrying out the invention, those skilled in the art will appreciate that there are numerous variations and permutations of the above described systems and techniques that fall within the spirit and scope of the invention as set forth in the appended claims.

References and Notes

-   1. T. Hunter, Philos Trans R Soc Lond B Biol Sci 353, 583-605 (1998).
-   2. P. Blume-Jensen, T. Hunter, Nature 411, 355-65 (2001).
-   3. G. Manning, D. B. Whyte, R. Martinez, T. Hunter, S. Sudarsanam, Science 298, 1912-34 (2002).
-   4. All annotated genes present in the draft human genome sequence (CHGD Assembly 25H, Jun. 19, 2001) were initially analyzed using both hidden Markov models and global homology searches using the Panther protein classification system (world wide web domain name: celera.com) to identify protein families of receptor and non-receptor tyrosine kinases. To eliminate potential artifactual clustering of related proteins lacking a kinase domain in these families, all identified proteins were further analyzed by blast analysis against the catalytic domain of the SRC protooncogene. From this analysis, only those proteins showing similarities with an E score of <1×10⁻¹⁴ were retained. All identified tyrosine kinase genes are available in Supplemental Table 1.
-   5. A. Danilkovitch-Miagkova, B. Zbar, J Clin Invest 109, 863-7 (2002).
-   6. Sequences for all available annotated exons and adjacent intronic sequences of identified TK genes were extracted from Celera draft human genome sequence (CHGD Assembly 25H, Jun. 19, 2001) or from Genbank (world wide web domain name: genbank.nlm.nih.gov). All exons encoding the catalytic domain of each kinase were identified by pairwise homology analyses to canonical tyrosine kinase catalytic domains.
-   7. Primers for PCR amplification and sequencing were designed using the Primer 3 program (world wide web domain name: www-genome.wi.mit.edu/cgi-bin/primer/primer3_www.cgi), and were synthesized by MWG (High Point, N.C.) and IDT (Coralville, Iowa). PCR amplification and sequencing were performed on tumor DNA from early passage cell lines as previously described (18) using 384 capillary automated sequencing apparatuses (Spectrumedix, State College, PA). Of the 589 exons extracted, 556 (94%) were successfully analyzed, each in an average of 33 tumor samples. Sequence traces were assembled and analyzed to identify potential genomic alterations using Mutation Explorer software package (SoftGenetics, State College, PA). Sequences of all primers used for PCR amplification and sequencing are available in Supplemental Table 2.
-   8. H. Davies et al., Nature (Jun. 9, 2002).
-   9. H. Rajagopalan et al., Nature 418, 934 (2002).
-   10. W. M. Grady et al., Cancer Res 59, 320-4 (1999).
-   11. S. J. Kim, Y. H. Im, S. D. Markowitz, Y. J. Bang, Cytokine Growth Factor Rev 11, 159-68 (2000).
-   12. All available annotated exons and adjacent intronic regions were extracted for NTRK3, FES, GUCY2F and MCCK from the Celera draft human genome sequence (CHGD Assembly 25H, Jun. 19, 2001). Tumor DNA from 142 MMR proficient and 48 MMR deficient early-passage colorectal cancer cell lines passaged in vitro or as xenografts in nude mice were analyzed for each exon.
-   13. S. K. Hanks, T. Hunter, Faseb J 9, 576-96 (1995).
-   14. S. R. Hubbard, J. H. Till, Annu Rev Biochem 69, 373-98 (2000).
-   15. Altered tyrosine kinase genes identified were aligned to other protein kinase genes using CLUSTAL and identified alterations were compared to previously observed mutations reported in the literature or at the Human Gene Mutation Database at Cardiff University (world wide web domain name: archive.uwcm.ac.uk/uwcm/mg/hgmd0.html).
-   16. M. Sancandi et al., J Pediatr Surg 35, 139-42; discussion 142-3 (2000).
-   17. I. Perrault et al., Eur J Hum Genet 8, 578-82 (2000).
-   18. T. L. Wang et al., Proc Natl Acad Sci USA 99, 3076-80 (2002).
-   19. R. B. Irby et al., Nat Genet 21, 187-90 (1999).
-   20. V. Wallenius et al., Am J Pathol 156, 821-9 (2000).
-   21. H. Zhang, K. A. Gallo, J Biol Chem 276, 45598-603 (2001).
-   22. S. R. Knezevich, D. E. McFadden, W. Tao, J. F. Lim, P. H. Sorensen, Nat Genet 18, 184-7 (1998).
-   23. S. J. Miknyoczki et al., Int J Cancer 81, 417-27 (1999).
-   24. D. Marchetti, D. J. McQuillan, W. C. Spohn, D. D. Carson, G. L. Nicolson, Cancer Res 56, 2856-63 (1996).
-   25. B. Scheijen, J. D. Griffin, Oncogene 21, 3314-33 (2002).
-   26. J. Hartkamp, J. Troppmair, U. R. Rapp, Cancer Res 59, 2195-202 (1999).
-   27. M. Nakamoto, A. D. Bergemann, Microsc Res Tech 59, 58-67 (2002).
-   28. K. A. Lucas et al., Pharmacol Rev 52, 375-414 (2000).
-   29. Total RNA was isolated from two primary colorectal cancers and two colorectal cancer cell lines using RNAgents (Promega, Madison, Wis.) and mRNA was selected using the MessageMaker Reagent Assembly (Gibco BRL). Single-stranded cDNA was generated using Superscript II Reverse Transcriptase (Gibco BRL) following the manufacturer's directions. Mock template preparations were prepared in parallel without the addition of reverse transcriptase. Quantitative PCR was performed with an iCycler (Bio-Rad, Hercules, Calif.) using SYBR Green dye (Molecular Probes, Eugene, Oreg.), as previously described (31).
-   30. B. J. Druker, Cancer Cell 1, 31-6 (2002).
-   31. S. Saha et al., Science 294, 1343-6 (Nov. 9, 2001).

1. A method for detecting mutations involved in cancer, comprising: identifying members of a family of genes in a database of human nucleotide sequences based on homology to a known member of the family; determining nucleotide sequence differences in a selected region of each of the members of the family of genes in matched pairs of an individual's cancer cells and normal cells, said differences identifying members of heightened interest; determining additional nucleotide sequence differences in the members of heightened interest, either in one or more additional regions outside of the selected region, or in matched pairs of cancer cells and normal cells of additional individuals, or in both.
 2. The method of claim 1 wherein the selected region encodes a catalytic domain.
 3. The method of claim 1 wherein the selected region encodes a regulatory domain.
 4. The method of claim 1 further comprising determining whether a nucleotide sequence difference is synonymous or non-synonymous, wherein a non-synonymous mutation is more likely to be functionally relevant to cancer.
 5. The method of claim 4 further comprising determining whether a nucleotide difference affects an evolutionarily conserved amino acid residue, wherein such a difference is more likely to be functionally relevant to cancer.
 6. The method of claim 4 further comprising determining whether a nucleotide difference affects a residue within a catalytic domain, wherein such a difference is more likely to be functionally relevant to cancer.
 7. The method of claim 4 further comprising determining whether a nucleotide difference affects a first amino acid residue which is a positional equivalent of a second amino acid residue in a protein encoded by another member of the family of genes, wherein mutation of said second residue in the protein encoded by another member of the family causes disease, wherein such a difference affecting the first amino acid is more likely to be functionally relevant to cancer.
 8. The method of claim 4 further comprising determining whether a nucleotide difference affects a first amino acid residue which is within 5 amino acid residues of an equivalent of a second amino acid residue in a protein encoded by another member of the family of genes, wherein mutation of said second residue in the protein encoded by another member of the family causes disease, wherein such a difference affecting the first amino acid is more likely to be functionally relevant to cancer.
 9. The method of claim 8 wherein the first amino acid residue is within 3 amino acid residues of the equivalent of the second amino acid residue.
 10. The method of claim 8 wherein the first amino acid residue is within 1 amino acid residue of the equivalent of the second amino acid residue.
 11. A method of screening test substances for use as anti-cancer agents, comprising: contacting a test substance with an activated protein kinase selected from the group consisting of: NTRK3, FES, MCCK, EPHA3, NTRK2, INSRR, JAK1, PDGFRA, EPHA7, EPHA8, KDR, FGFR1, and ERBB4; testing activity of the activated protein kinase, wherein a test substance which inhibits the activity of the activated protein kinase is a potential anti-cancer agent.
 12. The method of claim 11 wherein the activated protein kinase is in a cell.
 13. The method of claim 11 wherein the activated protein kinase is isolated from a cell.
 14. The method of claim 11 wherein the activated protein kinase is in a cell of a cancer cell line.
 15. The method of claim 11 wherein the activated protein kinase is in a cell which has been modified to express the activated protein kinase.
 16. The method of claim 11 wherein the activated protein kinase is NTRK3.
 17. The method of claim 11 wherein the activated protein kinase is FES.
 18. The method of claim 11 wherein the activated protein kinase is MCCK.
 19. The method of claim 11 wherein the activated protein kinase is EPHA3.
 20. The method of claim 11 wherein the activated protein kinase is NTRK2.
 21. The method of claim 11 wherein the activated protein kinase is INSRR.
 22. The method of claim 11 wherein the activated protein kinase is JAK1.
 23. The method of claim 11 wherein the activated protein kinase is PDGFRA.
 24. The method of claim 11 wherein the activated protein kinase is EPHA7.
 25. The method of claim 11 wherein the activated protein kinase is EPHA8.
 26. The method of claim 11 wherein the activated protein kinase is ERBB4.
 27. The method of claim 11 wherein the activated protein kinase is FGFR1.
 28. The method of claim 11 wherein the activated protein kinase is KDR.
 29. The method of claim 16 wherein the activated NTRK3 protein kinase has a mutation from the group consisting of I695V, G608S, L760I, K732T, and R731Q.
 30. The method of claim 17 wherein the activated FES protein kinase has a mutation from the group consisting of M704V, R706Q, V743M, and S759F.
 31. The method of claim 18 wherein the activated MCCK protein kinase has a mutation from the group consisting of H261Y, H261Q, G291E, A293E, W296Stop, R470C, R553Stop, N596I, and K629E.
 32. The method of claim 19 wherein the activated EPHA3 protein kinase has a mutation from the group consisting of S792P and D806N.
 33. The method of claim 20 wherein the activated NTRK2 protein kinase has a mutation from the group consisting of T695I and D751N.
 34. The method of claim 21 wherein the activated INSRR protein kinase has a mutation T985M.
 35. The method of claim 22 wherein the activated JAK1 protein kinase has a mutation E886K.
 36. The method of claim 23 wherein the activated PDGFRA protein kinase is the result of a G→A splice site mutation at position 1 of the donor site of exon 15.
 37. The method of claim 24 wherein the activated EPHA7 protein kinase has a mutation S768I.
 38. The method of claim 25 wherein the activated EPHA8 protein kinase has a mutation D873N.
 39. The method of claim 26 wherein the activated ERBB4 protein kinase has a mutation I1030M.
 40. The method of claim 27 wherein the activated FGFR1 protein kinase has a mutation A429S.
 41. The method of claim 28 wherein the activated KDR protein kinase has a mutation selected from the group consisting of G800D, R819Stop, and A1073T.
 42. A method of screening test substances for use as anti-cancer agents, comprising: contacting a test substance with a mutated GUCY2F guanylate cyclase; testing activity of the mutated GUCY2F guanylate cyclase, wherein a test substance which increases the activity of the mutated GUCY2F guanylate cyclase is a potential anti-cancer agent.
 43. The method of claim 42 wherein the mutated GUCY2F guanylate cyclase is in a cell.
 44. The method of claim 42 wherein the mutated GUCY2F guanylate cyclase is isolated from a cell.
 45. The method of claim 42 wherein the mutated GUCY2F guanylate cyclase is in a cell of a cancer cell line.
 46. The method of claim 42 wherein the mutated GUCY2F guanylate cyclase is in a cell which has been modified to express the mutated GUCY2F guanylate cyclase.
 47. The method of claim 42 wherein the mutated GUCY2F guanylate cyclase has a mutation selected from the group consisting of D225Y, A360T, Q361H, F390L, R492H, R545S, E624D, E778G, the result of a T→C splice site mutation at position 2 of the donor splice site of exon 17, and V1026M.
 48. An isolated, activated protein kinase selected from the group consisting of: NTRK3, FES, MCCK, GUCY2F, EPHA3, NTRK2, INSRR, JAK1, PDGFRA, EPHA7, EPHA8, KDR, FGFR1, and ERBB4.
 49. The isolated, activated protein kinase of claim 48 wherein the activated protein kinase is NTRK3.
 50. The isolated, activated protein kinase of claim 48 wherein the activated protein kinase is FES.
 51. The isolated, activated protein kinase of claim 48 wherein the activated protein kinase is MCCK.
 52. The isolated, activated protein kinase of claim 48 wherein the activated protein kinase is EPHA3.
 53. The isolated, activated protein kinase of claim 48 wherein the activated protein kinase is NTRK2.
 54. The isolated, activated protein kinase of claim 48 wherein the activated protein kinase is INSRR.
 55. The isolated, activated protein kinase of claim 48 wherein the activated protein kinase is JAK1.
 56. The isolated, activated protein kinase of claim 48 wherein the activated protein kinase is PDGFRA.
 57. The isolated, activated protein kinase of claim 48 wherein the activated protein kinase is EPHA7.
 58. The isolated, activated protein kinase of claim 48 wherein the activated protein kinase is EPHA8.
 59. The isolated, activated protein kinase of claim 48 wherein the activated protein kinase is ERBB4.
 60. The isolated, activated protein kinase of claim 48 wherein the activated protein kinase is KDR.
 61. The isolated, activated protein kinase of claim 48 wherein the activated protein kinase is FGFR1.
 62. The isolated, activated protein kinase of claim 49 wherein the activated NTRK3 protein kinase has a mutation from the group consisting of I695V, G608S, L760I, K732T, and R731Q.
 63. The isolated, activated protein kinase of claim 50 wherein the activated FES protein kinase has a mutation from the group consisting of M704V, R706Q, V743M, and S759F.
 64. The isolated, activated protein kinase of claim 51 wherein the activated MCCK protein kinase has a mutation from the group consisting of H261Y, H261Q, G291E, A293E, W296Stop, R470C, R553Stop, N596I, and K629E.
 65. The isolated, activated protein kinase of claim 52 wherein the activated EPHA3 protein kinase has a mutation from the group consisting of S792P and D806N.
 66. The isolated, activated protein kinase of claim 53 wherein the activated NTRK2 protein kinase has a mutation from the group consisting of T695I and D751N.
 67. The isolated, activated protein kinase of claim 54 wherein the activated INSRR protein kinase has a mutation T985M.
 68. The isolated, activated protein kinase of claim 55 wherein the activated JAK1 protein kinase has a mutation E886K.
 69. The isolated, activated protein kinase of claim 56 wherein the activated PDGFRA protein kinase is the result of a G→A splice site mutation at position 1 of the donor site of exon 15.
 70. The isolated, activated protein kinase of claim 57 wherein the activated EPHA7 protein kinase has a mutation S768I.
 71. The isolated, activated protein kinase of claim 58 wherein the activated EPHA8 protein kinase has a mutation D873N.
 72. The isolated, activated protein kinase of claim 59 wherein the activated ERBB4 protein kinase has a mutation I1030M.
 73. The isolated, activated protein kinase of claim 60 wherein the activated KDR protein kinase has a mutation selected from the group consisting of G800D, R819Stop, and A1073T.
 74. The isolated, activated protein kinase of claim 61 wherein the activated FGFR1 protein kinase has a mutation A429S.
 75. An isolated, mutated GUCY2F protein.
 76. The isolated, mutated GUCY2F protein of claim 75 which has a mutation from the group consisting of D225Y, A360T, Q361H, F390L, R492H, R545S, E624D, E778G, the result of a T→C splice site mutation at position 2 of the donor splice site of exon 17, and V1026M.
 77. A method of categorizing cancers, comprising: determining the sequence of one or more protein kinase family members selected from the group consisting of NTRK3, FES, MCCK, EPHA3, NTRK2, INSRR, JAK1, PDGFRA, EPHA7, EPHA8, GUCY2F, KDR, FGFR1, and ERBB4 in a sample of a cancer tissue; identifying a somatic mutation of said one or more protein kinase family members in the cancer tissue; assigning the cancer tissue to a set based on the presence of the somatic mutation.
 78. The method of claim 77 wherein the mutation is one which activates protein kinase activity.
 79. The method of claim 78 wherein the protein kinase family member is NTRK3.
 80. The method of claim 78 wherein the protein kinase family member is FES.
 81. The method of claim 78 wherein the protein kinase family member is MCCK.
 82. The method of claim 78 wherein the protein kinase family member is EPHA3.
 83. The method of claim 78 wherein the protein kinase family member is NTRK2.
 84. The method of claim 78 wherein the protein kinase family member is INSRR.
 85. The method of claim 78 wherein the protein kinase family member is JAK1.
 86. The method of claim 78 wherein the protein kinase family member is PDGFRA.
 87. The method of claim 78 wherein the protein kinase family member is EPHA7.
 88. The method of claim 77 wherein the protein kinase family member is GUCY2F.
 89. The method of claim 78 wherein the protein kinase family member is ERBB4.
 90. The method of claim 78 wherein the protein kinase family member is EPHA8.
 91. The method of claim 78 wherein the protein kinase family member is KDR.
 92. The method of claim 78 wherein the protein kinase family member is FGFR1.
 93. The method of claim 77 wherein the set is used to analyze or design clinical trials.
 94. The method of claim 77 wherein the set is used to correlate with prognostic data.
 95. The method of claim 77 wherein the set is used to correlate with recurrence data.
 96. The method of claim 77 wherein the set is used to select an appropriate therapeutic agent. 
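
By way of illustration only, and without limiting the claims, the determinations recited in claims 4 and 7 through 10 and the assignment step of claim 77 might be sketched in Python as follows. The function names, the use of alignment coordinates for residue positions, and the sample identifiers are assumptions introduced for this example; they do not appear in the disclosure. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_change(normal_codon, tumor_codon):
    """Classify a codon change as 'synonymous' or 'non-synonymous' (claim 4).

    A change to a stop codon is reported as non-synonymous.
    """
    before = CODON_TABLE[normal_codon.upper()]
    after = CODON_TABLE[tumor_codon.upper()]
    return "synonymous" if before == after else "non-synonymous"


def near_disease_residue(position, disease_positions, window=5):
    """Return True if `position` lies within `window` residues of any
    disease-associated residue of another family member (claims 8 to 10).

    Positions are assumed to be expressed in a shared alignment coordinate
    system; window=0 corresponds to a positional equivalent (claim 7).
    """
    return any(abs(position - p) <= window for p in disease_positions)


def assign_to_sets(somatic_mutations):
    """Group tumor sample identifiers by mutated gene (claim 77).

    `somatic_mutations` is an iterable of (sample_id, gene) pairs.
    """
    sets = {}
    for sample_id, gene in somatic_mutations:
        sets.setdefault(gene, set()).add(sample_id)
    return sets


# Hypothetical inputs, for illustration only.
print(classify_change("GAT", "AAT"))   # Asp -> Asn: non-synonymous
print(classify_change("CTG", "CTA"))   # Leu -> Leu: synonymous
print(near_disease_residue(103, [100], window=3))
print(assign_to_sets([("T1", "NTRK3"), ("T2", "NTRK3"), ("T3", "FES")]))
```

In this sketch the window argument plays the role of the 5, 3, and 1 residue distances recited in claims 8, 9, and 10, and the sets returned by the last function could then be correlated with prognostic, recurrence, or clinical trial data as recited in claims 93 through 96.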